Using Selective-Sampling Simulations in Poker
Authors
Abstract
Until recently, AI research that used games as an experimental testbed has concentrated on perfect information games. Many of these games have been amenable to so-called brute-force search techniques. In contrast, games of imperfect information, such as bridge and poker, contain hidden knowledge, making similar search techniques impractical. This paper describes work being done on developing a world-class poker-playing program. Part of the program’s playing strength comes from real-time simulations. The program generates an instance of the missing data, subject to any constraints that have been learned, and then searches the game tree to determine a numerical result. By repeating this a sufficient number of times, a statistically meaningful sample can be obtained to be used in the program’s decision-making process. For constructing programs to play two-player deterministic perfect information games, there is a well-defined framework based on the alpha-beta search algorithm. For imperfect information games, no comparable framework exists. In this paper we propose selective sampling simulations as a general-purpose framework for building programs to achieve high performance in imperfect
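The simulation loop described in the abstract (generate an instance of the missing data under learned constraints, evaluate it, repeat, then decide from the sample) can be illustrated with a minimal Python sketch. This is not the paper's program: the one-card toy game, the `payoff` function standing in for a game-tree search, and the names `sample_hidden_card`, `simulate`, and `choose_action` are all assumptions made for illustration. The `weights` list is a hypothetical stand-in for whatever has been learned about the opponent's likely holdings.

```python
import random
import statistics


def sample_hidden_card(deck, weights, rng):
    """Draw one possible opponent card. The weights bias the draw toward
    holdings considered more likely (the 'selective' part of the sampling)."""
    return rng.choices(deck, weights=weights, k=1)[0]


def payoff(action, our_card, opp_card, pot, bet):
    """Toy stand-in for searching the game tree: folding costs nothing more,
    calling wins the pot with the higher card and loses the bet otherwise."""
    if action == "fold":
        return 0.0
    return float(pot) if our_card > opp_card else -float(bet)


def simulate(action, our_card, deck, weights, pot, bet, trials, rng):
    """Repeat the sample-and-evaluate step and summarise the results as
    (mean payoff, standard error of the mean). Requires trials >= 2."""
    results = [
        payoff(action, our_card, sample_hidden_card(deck, weights, rng), pot, bet)
        for _ in range(trials)
    ]
    mean = statistics.fmean(results)
    stderr = statistics.stdev(results) / trials ** 0.5
    return mean, stderr


def choose_action(our_card, deck, weights, pot, bet, trials, rng):
    """Pick the action with the highest simulated mean payoff."""
    actions = ("fold", "call")
    means = {
        a: simulate(a, our_card, deck, weights, pot, bet, trials, rng)[0]
        for a in actions
    }
    return max(actions, key=lambda a: means[a])


if __name__ == "__main__":
    rng = random.Random(0)
    our_card = 8
    deck = [c for c in range(1, 11) if c != our_card]  # cards the opponent may hold
    uniform = [1.0] * len(deck)                        # nothing learned yet
    print(choose_action(our_card, deck, uniform, pot=4, bet=2, trials=2000, rng=rng))
```

In this sketch the standard error gives a rough sense of whether the sample is large enough to separate two actions; a real program would replace `payoff` with a search or playout of the remaining game and derive the weights from the opponent's observed behaviour.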
Similar resources
MCRNR: Fast Computing of Restricted Nash Responses by Means of Sampling
This paper presents a sample-based algorithm for the computation of restricted Nash strategies in complex extensive form games. Recent work indicates that regret-minimization algorithms using selective sampling, such as Monte-Carlo Counterfactual Regret Minimization (MCCFR), converge faster to Nash equilibrium (NE) strategies than their non-sampled counterparts which perform a full tree travers...
Learning a Value Analysis Tool for Agent Evaluation
Evaluating an agent’s performance in a stochastic setting is necessary for agent development, scientific evaluation, and competitions. Traditionally, evaluation is done using Monte Carlo estimation; the magnitude of the stochasticity in the domain or the high cost of sampling, however, can often prevent the approach from resulting in statistically significant conclusions. Recently, an advantage...
Monte Carlo Sampling for Regret Minimization in Extensive Games
Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome samplin...
Using Probabilistic Knowledge and Simulation to Play Poker
Until recently, artificial intelligence researchers who use games as their experimental testbed have concentrated on games of perfect information. Many of these games have been amenable to brute-force search techniques. In contrast, games of imperfect information, such as bridge and poker, contain hidden information making similar search techniques impractical. This paper describes recent progr...
Beyond Chance? The Persistence of Performance in Online Poker
A major issue in the widespread controversy about the legality of poker and the appropriate taxation of winnings is whether poker should be considered a game of skill or a game of chance. To inform this debate we present an analysis into the role of skill in the performance of online poker players, using a large database with hundreds of millions of player-hand observations from real money ring...
Journal title:
Volume, issue:
Pages: -
Publication date: 1999